Understanding the Evolution of File Systems: From Single Machine to Distributed
The Evolution of File Systems
Thomas Rivera, Hitachi Data Systems
Craig Harmer, April 2011
The Evolution of File Systems
© 2012 Storage Networking Industry Association. All Rights Reserved.
SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA.
Member companies and individuals may use this material in presentations and
literature under the following conditions:
Any slide or slides used must be reproduced without modification
The SNIA must be acknowledged as source of any material used in the body of any
document containing material from these presentations.
This presentation is a project of the SNIA Education Committee.
Neither the Author nor the Presenter is an attorney and nothing in this presentation
is intended to be nor should be construed as legal advice or opinion. If you need
legal advice or legal opinion please contact an attorney.
The information presented herein represents the Author's personal opinion and
current understanding of the issues involved. The Author, the Presenter, and the
SNIA do not assume any responsibility or liability for damages arising out of any
reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
Abstract
The File Systems Evolution
Over time, additional file systems appeared, focusing on specialized requirements
such as:
data sharing, remote file access, distributed file access, parallel file access, HPC,
archiving, security, etc.
Due to the dramatic growth of unstructured data, files as the basic units for data
containers are morphing into file objects, providing more semantics and
feature-rich capabilities for content processing.
This presentation will:
Categorize and explain the basic principles of currently available file system
architectures (e.g. Local, Shared, SAN, Clustered, Network, Distributed, Parallel, etc.)
Explain technologies like Scale-Out NAS, NAS Aggregation, NAS Virtualization, NAS
Clustering, Global Namespace, Parallel NFS
Review new file system architectures being developed
Related Tutorials
Check out SNIA Tutorial:
Using File Server Protocols for
Block-based Storage Workloads
Check out SNIA Tutorial:
Understanding Enterprise NAS
Check out SNIA Tutorial:
pNFS and NFS V4.2
Why File Systems Have Evolved
Scale
Megabytes → Petabytes
Requirements
High availability
Data sharing
Remote access
Performance
Archiving
others…
(Not a strict timeline—new capabilities are generally incremental)
[Diagram: file system types placed along a time axis: Local File System, Shared File System, SAN File System, Cluster File System, Network File System, Distributed File System, Parallel File System, Object File System, followed by "?"]
Where File Systems Live
[Diagram: where the file system sits in a UNIX-like operating system]
User space: user applications and libraries (ls, mv, rm, cp, ...)
System calls: open(), close(), read(), write(), ioctl(), mmap(), ...
Kernel space: process management, memory management, scheduler, IPC, VFS, file system, data cache*, segmap cache, buffers, volume manager, device drivers, DMA
Machine dependent code
Hardware
*can be bypassed by using direct I/O
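As a rough illustration of the system-call boundary on this slide, the sketch below reads the same file once through read() and once through mmap(), using Python's standard os and mmap modules as thin wrappers around those calls. The helper name read_both_ways is hypothetical, and the payload is assumed to be non-empty because an empty file cannot be mapped.

```python
import mmap
import os
import tempfile


def read_both_ways(payload: bytes):
    """Write payload to a temporary file, then read it back via read() and mmap()."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)          # write() system call
        os.fsync(fd)                   # push the cached data down to the device
        os.lseek(fd, 0, os.SEEK_SET)   # rewind before reading
        via_read = os.read(fd, len(payload))   # read() copies data into a user buffer
        # mmap() maps the file's pages into the process address space instead.
        with mmap.mmap(fd, len(payload), access=mmap.ACCESS_READ) as view:
            via_mmap = view[:]
    finally:
        os.close(fd)
        os.unlink(path)
    return via_read, via_mmap
```

Both paths go through the kernel's file system code; they differ only in how the data reaches the application.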
What File Systems Do
(UNIX example)
[Diagram: a UNIX inode and the data blocks it points to]
File locators (“inodes”): each inode holds the file attributes and the data locators
File attributes: file owner, file type, permissions, last access, size, # of links, ...
Data locators (pointers): direct 0 through direct 9, single indirect, double indirect, triple indirect
Data (blocks): data blocks, numbered 0 through 19 in the figure
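The pointer scheme in the inode figure can be sketched as a lookup from a logical block number to the pointer level that covers it. This is a minimal sketch, not any particular file system's code: the ten direct pointers come from the slide, while the function name locate_block and the ptrs_per_block default are assumptions chosen for illustration.

```python
NUM_DIRECT = 10  # the slide shows direct 0 .. direct 9


def locate_block(logical_block: int, ptrs_per_block: int = 1024):
    """Return (level, indices) describing how an inode reaches a logical block.

    indices lists the slot to follow at each level of pointer blocks.
    ptrs_per_block is an assumed value; real file systems derive it from
    the block size and the pointer width.
    """
    if logical_block < 0:
        raise ValueError("logical block numbers start at 0")
    p = ptrs_per_block
    n = logical_block
    if n < NUM_DIRECT:
        return ("direct", [n])
    n -= NUM_DIRECT
    if n < p:
        return ("single indirect", [n])
    n -= p
    if n < p * p:
        return ("double indirect", [n // p, n % p])
    n -= p * p
    if n < p * p * p:
        return ("triple indirect", [n // (p * p), (n // p) % p, n % p])
    raise ValueError("block lies beyond the maximum file size")
```

Small files are reached through direct pointers alone, and each further level of indirection multiplies the addressable range by ptrs_per_block at the cost of one more block read.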
A File System Taxonomy
File Systems:
Local File System
Shared File System
SAN File System
Cluster File System
Network File System
Distributed File System
Distributed Parallel File System
Local File System
The file system is co-located in the server with the application
[Diagram: an application running on top of a local file system]
Local File System
Separate “islands” of data
Limitation: no data sharing
[Diagram: four servers, each running an application on its own file system]
One Way to Share Data: Scale-Up
Vertical scaling
Another Way to Share Data: Scale-Out
Horizontal scaling
[Diagram: multiple servers attached through a storage network to shared data]
Shared device ≠ shared data
Shared Device: a multi-LUN device shared among clients; each client has exclusive access to a dedicated LUN
Shared Data: a physical device shared among clients; clients access LUNs concurrently
Data Access with Shared/Global File System
Separate logical and physical placement
Metadata server
File access is a three-step transaction:
Step 1: Request access (the client asks the metadata server, MDS)
Step 2: Metadata delivery (the MDS answers the client)
Step 3: Data access (the client accesses the data)
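The three-step transaction can be sketched with a toy metadata server and client. Everything here is illustrative: the class names, the register and request_access methods, and the dictionary standing in for shared storage are assumptions, not the API of any real shared file system.

```python
class MetadataServer:
    """Toy MDS: knows where each file's blocks live, but never serves data."""

    def __init__(self):
        self._layouts = {}  # path -> ordered list of block numbers

    def register(self, path, blocks):
        self._layouts[path] = list(blocks)

    def request_access(self, path):
        # Step 1 arrives here; step 2 is the returned block list.
        return list(self._layouts[path])


class Client:
    def __init__(self, mds, storage):
        self.mds = mds
        self.storage = storage  # stands in for the shared storage: block number -> bytes

    def read_file(self, path):
        layout = self.mds.request_access(path)              # steps 1 and 2
        return b"".join(self.storage[b] for b in layout)    # step 3: direct data access
```

The point of the split is that the metadata server only answers "where"; the bulk data moves directly between the client and the shared storage.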
Shared/Global File System
Asymmetric (“SAN File System”)
One active metadata server
Typically homogeneous (scaling limited by metadata server capacity)
Inter-node distance limited by storage network capability
[Diagram: five application servers (e.g. web servers) on a client network, with one active and one passive metadata server and several data servers, attached through a storage network to the shared data]
Shared/Global File System
Symmetric (“Cluster File System”)
Metadata server in each node
Typically homogeneous (scaling limited by internal communication, e.g., distributed locking)
Inter-node distance limited by storage network capability
[Diagram: five application servers (e.g. web servers) on a client network, each node with an active metadata server and a data server, attached through a storage network to the shared data]
Network File Systems
(aka Proxy File Systems)
Enables sharing of files located on a file server among one or more client
computers using a network protocol
Network protocol examples: NFS, CIFS, AFP, WebDAV, FTP, HTTP, ...
[Diagram: a local file system (application and file system in one machine) next to a network file system, where applications use file system clients that talk over the network protocol to a file system server]
Network File System “Stack”
(Example: Sun’s NFS)
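Client stack (top to bottom): Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC
Client and server are connected by a LAN
Server stack, network side (top to bottom): NFS Server, RPC/XDR, TCP/IP, Ethernet NIC
Server stack, storage side (top to bottom): File System, Volume Mgr, SCSI Driver, SCSI HBA
The server reaches the data through a SAN and the storage device's SCSI port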
Wide Area Network File Systems
Consolidation eases
Management
Administration
Cost
Compliance
Global file sharing and collaboration
Location consolidation and optimization
But: WAN performance is low compared to LAN/SAN performance
[Diagram: an NFS client stack (Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC) connected over a WAN to an NFS server stack, which reaches the data through File System, Volume Mgr, SCSI Driver, SCSI HBA and a SAN]
Improving Wide Area File System Performance
Application-specific optimizations: email, document management, SQL, ...
Protocol-specific optimizations: HTTP, NFS, CIFS, WebDAV, FTP, TCP/IP, ...
Transport acceleration: TCP accelerators
Intelligent caching: read-ahead, deferred write, coherency, ...
Data compression: algorithms, file-aware differencing, data aggregation, I/O clustering, chunk-based de-duplication, cross-protocol data reduction, ...
[Diagram: NFS/CIFS clients on a LAN, a compression engine at each end of the WAN link, and an NFS server on the remote LAN with SAN-attached data]
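One of the data reduction techniques named on this slide, chunk-based de-duplication, can be sketched as follows. This is a simplified illustration under stated assumptions: chunks are fixed-size (real WAN optimizers often use content-defined boundaries), SHA-256 digests identify chunks, and the function name dedup_chunks and the ("ref", ...) / ("data", ...) message shapes are hypothetical.

```python
import hashlib


def dedup_chunks(data: bytes, chunk_size: int, known: set):
    """Split data into fixed-size chunks and replace already-seen chunks by a reference.

    known holds the digests of chunks the far side is assumed to have already;
    it is updated in place as new chunks are sent.
    """
    messages = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in known:
            messages.append(("ref", digest))    # send only the short digest
        else:
            known.add(digest)
            messages.append(("data", chunk))    # first sighting: send the bytes
    return messages
```

Sending the same file a second time then costs only one digest per chunk, which is where the WAN savings come from.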
Distributed File System (DFS)
A network file system with files distributed among multiple file servers
Not a parallel file system
Client view: a single file system with root / containing /a, /b and /c
[Diagram: an application's file system client uses a network protocol to reach three file system servers, which hold /a, /b and /c between them]
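The single-namespace client view can be sketched as a lookup from the top-level directory to the server that holds it. The server names and the resolve function are hypothetical; only the /a, /b, /c layout comes from the slide.

```python
# Hypothetical mapping of top-level directories to file servers.
EXPORTS = {"/a": "server1", "/b": "server2", "/c": "server3"}


def resolve(path: str, exports=EXPORTS):
    """Return (server, path) for the file server that holds the given path."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("the root directory is assembled by the client view itself")
    top = "/" + parts[0]
    if top not in exports:
        raise FileNotFoundError(path)
    return exports[top], path
```

Each whole file lives on exactly one server, so a single file's I/O is not spread across servers; that is the difference from a parallel file system.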
Distributed Parallel File System
Aggregation of storage servers: RAIN + RAID (aka Network RAID)
Global Namespace
Segments of files distributed across storage nodes
Enables parallel I/O to individual files (aka file striping)
[Diagram: clients reach five file servers over a network protocol; a single file is spread across the servers]
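File striping can be sketched as simple arithmetic that maps a byte offset in the file to a server and an offset within that server's portion. The round-robin layout, the function names, and the parameters below are assumptions for illustration; real parallel file systems use richer layouts.

```python
def stripe_location(offset: int, stripe_size: int, num_servers: int):
    """Map a file byte offset to (server index, offset within that server's data)."""
    stripe_index = offset // stripe_size
    server = stripe_index % num_servers            # stripes are dealt out round-robin
    local_stripe = stripe_index // num_servers     # stripes this server holds before it
    return server, local_stripe * stripe_size + offset % stripe_size


def split_read(offset: int, length: int, stripe_size: int, num_servers: int):
    """Break a byte range into per-server (server, local_offset, length) requests."""
    pieces = []
    end = offset + length
    while offset < end:
        chunk = min(stripe_size - offset % stripe_size, end - offset)
        server, local = stripe_location(offset, stripe_size, num_servers)
        pieces.append((server, local, chunk))
        offset += chunk
    return pieces
```

A read that spans several stripes turns into independent requests to different servers, which can then be issued in parallel.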
NAS Aggregation
In-band solution, sometimes called a “NAS Router”
[Diagram: an IP network leads to a NAS router that presents a global namespace over three file servers, each with SAN-attached data]
NAS Virtualization - Out-of-Band
Individual files / file segments pinned to file servers
Files can be distributed and/or replicated for parallel access
Files can be striped for intra-file parallel access
Clients must locate the right file server
e.g. NFSv4.1 (pNFS), Microsoft’s DFS
[Diagram: four clients and a metadata server (MDS) providing the global namespace, connected over an IP network to four file servers]
Distributed files: File_A, File_G, File_B, File_D, File_F, File_H, File_C, File_E
Striped files: File_K_1, File_K_2, File_K_3, File_K_4
Replicated files: File_A’, File_B’’, File_C’, File_B’
NAS Virtualization – NFSv4.1 pNFS
In-Band NAS: application servers (NFSv4 clients) reach the data over IP through a NAS appliance that sits in front of the SAN
Out-of-Band NAS: application servers (NFSv4.1 clients with pNFS) use a NAS appliance with NFSv4.1 pNFS extensions over IP and access the data directly
Data path decoupled from control and metadata path
Storage Protocols:
Block: FCP, iSCSI, SRP, SAS
File: NFSv4.1
Object: OSD
Toward “Storage Grids” via NAS
Classic Filer: data services (NFS, CIFS) on a local file system
Clients reach the data services over IP through a VIP address
Two variants of clustered data services (NFS, CIFS, HTTP, FTP, WebDAV):
Each file pinned to a single server...
Cluster (Parallel) File System: all nodes serve all files...
Cloud: The New Grid
NAS Cluster is effectively a storage cloud
[Diagram: groups of clients around a storage cloud made up of file servers]
Data Segmentation
Data is segmented along two axes: Fixed Data vs. Dynamic Data, and Structured vs. Unstructured
Example workloads shown in the four quadrants:
Media production, eCAD, mCAD, Office docs
Media-archive, DAM, Broadcast, Medical imaging, Media-Internet
Transactional systems, ERP, CRM
BI, Data warehousing, Scientific, Transaction archive
The New Reality of Data Segmentation
Data is segmented along two axes: Fixed Data vs. Dynamic Data, and Structured vs. Semi-Structured* vs. Unstructured
Example workloads shown in the figure:
Media production, eCAD, mCAD, Office docs
Media-archive, DAM, Broadcast, medical imaging, Media-Internet
Transactional systems, ERP, CRM
BI, data warehousing, scientific, transaction archive
*Semi-Structured Data contains dynamic metadata defined by users and/or applications
Traditional Files
Metadata: Owner, permissions, type, last modification, ...
Data
Semi-Structured Data
Object ID
Data
Metadata: Owner, permissions, type, last modification, ...
Attributes: User/application defined
Policies: e.g., Replication
Methods: e.g., Encryption
The File Object Model
Each object consists of:
Object ID
Data
Metadata: Owner, permissions, type, last modification, ...
Attributes: User/application defined
Policies: e.g., Replication
Methods: e.g., Encryption
Store: Data → OID
Retrieve: OID → Data
[Diagram: a traditional inode pointing to data blocks, next to a table of Name/OID entries pointing to objects]
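The store and retrieve operations and the name-to-OID table from this slide can be sketched with an in-memory object store. The class and method names are hypothetical, random UUIDs stand in for object IDs, and attributes are kept as a plain dictionary; none of this reflects a specific object storage product.

```python
import uuid


class ObjectStore:
    """Toy object store: data goes in, an object ID (OID) comes out."""

    def __init__(self):
        self._objects = {}  # OID -> {"data": bytes, "attributes": dict}
        self._names = {}    # name -> OID, like the Name/OID table on the slide

    def store(self, data: bytes, name=None, attributes=None) -> str:
        oid = uuid.uuid4().hex  # assumption: a random identifier serves as the OID
        self._objects[oid] = {"data": data, "attributes": dict(attributes or {})}
        if name is not None:
            self._names[name] = oid
        return oid

    def retrieve(self, oid: str) -> bytes:
        return self._objects[oid]["data"]

    def lookup(self, name: str) -> str:
        return self._names[name]

    def attributes(self, oid: str) -> dict:
        return dict(self._objects[oid]["attributes"])
```

Unlike an inode, the caller never sees block locations: the OID is the only handle, and user-defined attributes travel with the object.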
Managing File Objects
File objects can be managed like records in a relational database with user
data as Binary Large Objects (BLOBs)
[Diagram: a database schema whose records are file objects, each consisting of Object ID, Data, Metadata, Attributes, Policies, and Methods]
Managing File Objects (Cont.)
[Diagram: a set of file objects, each consisting of Object ID, Data, Metadata, Attributes, Policies, and Methods]
Indexes
Constraints/relationships
Object search
Full text search
Join operations
Virtual views
SQL-like requests
Cursors
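The idea of managing file objects like database records, with the user data held as a BLOB, can be sketched with Python's built-in sqlite3 module. The table layout, column names, and sample rows are invented for illustration and do not come from the tutorial.

```python
import sqlite3


def build_catalog():
    """Create an in-memory catalog with one row per file object."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE objects ("
        " oid TEXT PRIMARY KEY,"   # object ID
        " owner TEXT,"             # traditional metadata
        " content_type TEXT,"      # user/application defined attribute
        " replicas INTEGER,"       # a policy, e.g. replication count
        " data BLOB)"              # the user data itself
    )
    db.execute("CREATE INDEX idx_owner ON objects(owner)")
    db.executemany(
        "INSERT INTO objects VALUES (?, ?, ?, ?, ?)",
        [
            ("oid-1", "alice", "image", 2, b"pixels"),
            ("oid-2", "bob", "document", 1, b"report"),
            ("oid-3", "alice", "document", 3, b"notes"),
        ],
    )
    return db


def objects_owned_by(db, owner):
    """SQL-like object search: return the OIDs of all objects with the given owner."""
    rows = db.execute(
        "SELECT oid FROM objects WHERE owner = ? ORDER BY oid", (owner,)
    )
    return [row[0] for row in rows]
```

Once objects are rows, the operations listed on the slide (indexes, searches, joins, views, cursors) come from the database engine rather than from the file system.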
Data Serving Hierarchy
3 Levels of Abstraction
An application may interface with the storage subsystem at any of three layers:
Block – highest performance and very little metadata
File – high performance and some metadata
Object – medium performance and rich metadata
[Diagram: an application on top of a data server platform offering Object, File, and Block layers, with many-to-one relationships between adjacent layers]
Attribution & Feedback
Please send any questions or comments regarding this SNIA Tutorial to
tracktutorials@snia.org
The SNIA Education Committee would like to thank the
following individuals for their contributions to this Tutorial.
Authorship History
Original Author: Christian Bandulet
Updates:
Thomas Rivera, September 2012
Paul Massiglia, Spring 2012
Craig Harmer, April 2011
Additional Contributors
Craig Harmer
Paul Massiglia
Joseph White
Thomas Rivera
Christian Bandulet
Appendix
Reference Material
www.wikipedia.org
ADFS – Acorn's Advanced Disc filing system, successor to DFS
BFS – the Be File System used on BeOS
EFS – Encrypting File System, an extension of NTFS
EFS (IRIX) – an older block filing system under IRIX
Ext – Extended filesystem, designed for Linux systems
Ext2 – Second extended filesystem, designed for Linux systems
Ext3 – Name for the journalled form of ext2
FAT – Used on DOS and Microsoft Windows, 12, 16 and 32 bit table depths
FFS (Amiga) – Fast File System, used on Amiga systems. This FS has evolved over time and now counts FFS1, FFS Intl, FFS DCache, FFS2
FFS – Fast File System, used on *BSD systems
Fossil – Plan 9 from Bell Labs snapshot archival file system
Files-11 – OpenVMS filesystem
GCR – Group Code Recording, a floppy disk data encoding format used by the Apple II and Commodore Business Machines in the 5¼" disk drives for their 8-bit computers
HFS – Hierarchical File System, used on older Mac OS systems
HFS Plus – Updated version of HFS used on newer Mac OS systems
HPFS – High Performance Filesystem, used on OS/2
ISO 9660 – Used on CD-ROM and DVD-ROM discs
(Rock Ridge and Joliet are extensions to this)
JFS – IBM Journaling Filesystem, provided in Linux, OS/2, and AIX
LFS – 4.4BSD implementation of a log-structured file system
MFS – Macintosh File System, used on early Mac OS systems
Minix file system – Used on Minix systems
NTFS – Used on Windows NT, Windows 2000, Windows XP and Windows Server 2003 systems
NSS – Novell Storage Services. This is a new 64-bit journaling filesystem using a balanced tree algorithm. Used in NetWare versions 5.0-up and recently ported to Linux.
OFS – Old File System, on Amiga. Nice for floppies, but fairly useless on hard drives
PFS – and PFS2, PFS3, etc. Technically interesting filesystem available for the Amiga; performs very well under a lot of circumstances. Very simple and elegant
ReiserFS – Filesystem that uses journaling
Reiser4 – Filesystem that uses journaling, newest version of ReiserFS
SFS – Smart File System, journaled file system available for the Amiga platforms
UDF – Packet based filesystem for WORM/RW media such as CD-RW and DVD.
UFS – Unix Filesystem, used on older BSD systems
UFS2 – Unix Filesystem, used on newer BSD systems
UMSDOS – FAT filesystem extended to store permissions and metadata, used for Linux
VxFS – Veritas file system, first commercial journaling file system; HP-UX, Solaris, Linux, AIX
VSAM
WAFL – Used on Network Appliance systems
XFS – Used on SGI IRIX and Linux systems
ZFS – Used on Solaris
SAM QFS (Oracle)
9P The Plan 9 and Inferno distributed file system
AFS (Andrew File System)
AppleShare
Arla (file system)
Coda
CXFS (Clustered XFS) a distributed networked file system designed by Silicon Graphics (SGI) specifically to be used in a SAN
Distributed File System (DCE)
Distributed File System (Microsoft)
Freenet
Global File System (GFS)
Google File System (GFS)
IBRIX Fusion™
InterMezzo
Isilon OneFS™
Lustre (Oracle)
NFS
OpenAFS
Server message block (SMB) (aka Common Internet File System (CIFS) or Samba file system)
Xsan (a storage area network (SAN) filesystem from Apple Computer, Inc.)
archfs (archive)
cdfs (reading and writing of CDs)
cfs (caching)
Davfs2 (WebDAV)
Devfs
ftpfs (ftp access)
fuse (filesystem in userspace, like lufs but better maintained)
GPFS an IBM cluster file system
JFFS/JFFS2 (filesystems designed specifically for flash devices)
LUFS (replaces ftpfs; ftp, ssh, ... access)
nntpfs (netnews)
OCFS (Oracle Cluster File System)